METHOD AND APPARATUS FOR DECODING HYBRID
INTRA-INTER CODED BLOCKS 

CROSS-REFERENCE TO RELATED APPLICATION 

This application claims the benefit of U.S. Provisional Application Serial No.
60/497,816 (Attorney Docket No. PU030258), filed August 26, 2003 and entitled
"METHOD AND APPARATUS FOR HYBRID MACROBLOCK MODES FOR VIDEO
CODECS", which is incorporated herein by reference in its entirety. Furthermore,
the application is closely related to a U.S. non-provisional Application entitled
"METHOD AND APPARATUS FOR ENCODING HYBRID INTRA-INTER CODED
BLOCKS", concurrently filed on August XX, 2004 (Attorney Docket No. PU030258).

FIELD OF THE INVENTION 

The invention relates generally to digital video CODECs, and more particularly
to the hybrid use of both intra and inter coding for macroblocks.

BACKGROUND OF THE INVENTION 

A video encoder can be used to encode one or more frames of an image
sequence into digital information. This digital information may then be transmitted to a
receiver, where the image or the image sequence can then be re-constructed
(decoded). The transmission channel itself may include any of a number of possible
channels for transmission. For example, the transmission channel might be a radio
channel or other means for wireless broadcast, a coaxial cable television cable, a
GSM mobile phone TDMA channel, a fixed line telephone link, or the Internet. This
list of transmission means is only illustrative and is by no means meant to be
all-inclusive.

Various international standards have been agreed upon for video encoding
and transmission. In general, a standard provides rules for compressing and
encoding data relating to frames of an image. These rules provide a way of
compressing and encoding image data to transmit less data than the viewing camera
originally provided about the image. This reduced volume of data then requires less
channel bandwidth for transmission. A receiver can re-construct (or decode) the
image from the transmitted data if it knows the rules that the transmitter used to
perform the compression and encoding. The H.264 standard minimizes redundant
transmission of parts of the image, by using motion compensated prediction of
macroblocks from previous frames.

Video compression architectures and standards, such as MPEG-2 and
JVT / H.264 / MPEG-4 Part 10 / AVC, encode a macroblock using only either an
intraframe ("intra") coding or an interframe ("inter") coding method for the encoding of
each macroblock. For interframe motion estimation/compensation, a video frame to
be encoded is partitioned into non-overlapping rectangular, or most commonly,
square blocks of pixels. For each of these macroblocks, the best matching
macroblock is searched from a reference frame in a predetermined search window
according to a predetermined matching error criterion. Then the matched macroblock
is used to predict the current macroblock, and the prediction error macroblock is
further processed and transmitted to the decoder. The relative shifts in the horizontal
and vertical directions of the reference macroblock with respect to the original
macroblock are grouped and referred to as the motion vector (MV) of the original
macroblock, which is also transmitted to the decoder. The main aim of motion
estimation is to predict a macroblock such that the difference macroblock obtained
from taking a difference of the reference and current macroblocks produces the
lowest number of bits in encoding.
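
The search described above can be sketched as an exhaustive block-matching loop. The sketch below is illustrative only: it assumes the sum of absolute differences (SAD) as the matching error criterion, integer-pel displacements, and frames stored as lists of rows. The names `sad` and `full_search` are hypothetical, and practical encoders typically weigh the bit cost of the motion vector as well.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Return (motion_vector, cost) of the best match in the search window.

    The SAD criterion is an assumption; the text only requires some
    predetermined matching error criterion.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= width):
                continue
            if not (0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The returned displacement plays the role of the motion vector (MV), and the per-pixel differences at that displacement form the prediction error macroblock.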

For intra coding, a macroblock (MB) or a sub-macroblock within a picture is
predicted using spatial prediction methods. For inter coding, temporal prediction
methods (i.e., motion estimation/compensation) are used. Inter prediction
(coding) methods are usually more efficient than intra coding methods. In the existing
architectures/standards, specific picture or slice types are defined which specify or
restrict the intra or inter MB types that can be encoded for transmission to a decoder.
In intra (I) pictures or slices, only intra MB types can be encoded, while in Predictive
(P) and Bi-predictive (B) pictures or slices, both intra and inter MB types may be
encoded.

An I-picture or I-slice contains only intra coded macroblocks and does not
use temporal prediction. The pixel values of the current macroblock are first spatially
predicted from their neighboring pixel values. The residual information is then
transformed using an NxN transform (e.g., a 4x4 or 8x8 DCT) and then
quantized.

B-pictures or B-slices introduce the concept of bi-predictive (or, in a
generalization, multiple-prediction) inter coded macroblock types, where a macroblock
(MB) or sub-block is predicted by two (or more) interframe predictions. Due to
bi-prediction, B-pictures usually tend to be more efficient in coding than both I and P
pictures.

A P-picture or B-picture may contain different slice types, and macroblocks
encoded by different methods. A slice can be of I (Intra), P (Predicted), B
(Bi-predicted), SP (Switching P), and SI (Switching I) type.

Intra and inter prediction methods have been used separately within video
coding architectures and standards such as MPEG-2 and H.264. For intra coded
macroblocks, available spatial samples within the same frame or picture are used to
predict current macroblocks, while in inter prediction, temporal samples within other
pictures or other frames are instead used. In the H.264 standard, two different intra
coding modes exist: a 4x4 intra mode which performs the prediction process for every
4x4 block within a macroblock; and a 16x16 intra mode, for which the prediction is
performed for the entire macroblock in a single step.

Each frame of a video sequence is divided into so-called "macroblocks",
which comprise luminance (Y) information and associated (potentially spatially
sub-sampled depending upon the color space) chrominance (U, V) information.
Macroblocks are formed by representing a region of 16x16 image pixels in the
original image as four 8x8 blocks of luminance (luma) information, each luminance
block comprising an 8x8 array of luminance (Y) values; and two spatially
corresponding chrominance components (U and V) which are sub-sampled by a
factor of two in the horizontal and vertical directions to yield corresponding arrays of
8x8 chrominance (U, V) values.

In 16x16 spatial (intra) prediction mode the luma values of an entire 16x16
macroblock are predicted from the pixels around the edges of the MB. In the 16x16
intra prediction mode, the 33 neighboring samples immediately above and/or to the
left of the 16x16 luma block are used for the prediction of the current macroblock, and
only 4 modes (0 vertical, 1 horizontal, 2 DC, and 3 plane prediction) are used.
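
Three of the four 16x16 modes can be sketched compactly. The sketch below assumes that both the row of 16 samples above and the column of 16 samples to the left are available; the plane mode (3), which also uses the corner sample, is omitted. The function name `intra16x16_predict` is hypothetical.

```python
def intra16x16_predict(mode, above, left):
    """Predict a 16x16 luma block from its neighbors.

    above: the 16 reconstructed samples directly above the block.
    left:  the 16 reconstructed samples directly to its left.
    Both are assumed available; plane prediction (mode 3) is not covered.
    """
    n = 16
    if mode == 0:   # vertical: every row copies the samples above
        return [list(above) for _ in range(n)]
    if mode == 1:   # horizontal: every column copies the samples to the left
        return [[left[y]] * n for y in range(n)]
    if mode == 2:   # DC: rounded mean of the 32 neighboring samples
        dc = (sum(above) + sum(left) + 16) >> 5
        return [[dc] * n for _ in range(n)]
    raise NotImplementedError("plane prediction is not covered by this sketch")
```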

FIG. 1 illustrates the intraframe (intra) prediction sampling method for the 4x4
intra mode in the H.264 standard of the related art. The samples of a 4x4 luma block
110 to be intra encoded, containing pixels "a" through "p" in FIG. 1, are predicted using
nearby pixels "A" through "M" in FIG. 1 from neighboring blocks. In the decoder,
samples "A" through "M" from previous macroblocks of the same picture/frame
typically have been already decoded and can then be used for prediction of the current
macroblock 110.

FIG. 2 illustrates, for the 4x4 luma block 110 of FIG. 1, the nine intra prediction
modes, with directions labeled 0, 1, 3, 4, 5, 6, 7, and 8. Mode 2 is the 'DC-prediction'.
The other modes (0, 1, 3, 4, 5, 6, 7, and 8) represent directions of predictions as
indicated by the arrows in FIG. 2.

The intra macroblock types that are defined in the H.264 standard are as
follows:

Table 1 - Intra Macroblock types

mb_type  Name of mb_type  MbPartPredMode  Intra16x16  CodedBlock     CodedBlock
                          (mb_type, 0)    PredMode    PatternChroma  PatternLuma
0        I_4x4            Intra_4x4       NA          NA             NA
1        I_16x16_0_0_0    Intra_16x16     0           0              0
2        I_16x16_1_0_0    Intra_16x16     1           0              0
3        I_16x16_2_0_0    Intra_16x16     2           0              0
4        I_16x16_3_0_0    Intra_16x16     3           0              0
5        I_16x16_0_1_0    Intra_16x16     0           1              0
6        I_16x16_1_1_0    Intra_16x16     1           1              0
7        I_16x16_2_1_0    Intra_16x16     2           1              0
8        I_16x16_3_1_0    Intra_16x16     3           1              0
9        I_16x16_0_2_0    Intra_16x16     0           2              0
10       I_16x16_1_2_0    Intra_16x16     1           2              0
11       I_16x16_2_2_0    Intra_16x16     2           2              0
12       I_16x16_3_2_0    Intra_16x16     3           2              0
13       I_16x16_0_0_1    Intra_16x16     0           0              15
14       I_16x16_1_0_1    Intra_16x16     1           0              15
15       I_16x16_2_0_1    Intra_16x16     2           0              15
16       I_16x16_3_0_1    Intra_16x16     3           0              15
17       I_16x16_0_1_1    Intra_16x16     0           1              15
18       I_16x16_1_1_1    Intra_16x16     1           1              15
19       I_16x16_2_1_1    Intra_16x16     2           1              15
20       I_16x16_3_1_1    Intra_16x16     3           1              15
21       I_16x16_0_2_1    Intra_16x16     0           2              15
22       I_16x16_1_2_1    Intra_16x16     1           2              15
23       I_16x16_2_2_1    Intra_16x16     2           2              15
24       I_16x16_3_2_1    Intra_16x16     3           2              15
25       I_PCM            NA              NA          NA             NA


FIG. 3 depicts a current macroblock 310 to be inter coded in a P-frame or
P-slice using temporal prediction, instead of spatial prediction, by estimating a motion
vector (i.e., MV, Motion Vector) between the best match (BM) among the blocks of
two pictures (301 and 302). In inter coding, a current block 310 in the current frame
301 is predicted from a displaced matching block (BM) in the previous frame 302.
Every inter coded block (e.g., 310) is associated with a set of motion parameters
(motion vectors and a reference index ref_idx), which provide to the decoder a
corresponding location within the reference picture (302) associated with ref_idx from
which all pixels in the block 310 can be predicted. The difference between the
original block (310) and its prediction (BM) is compressed and transmitted along with
the displacement motion vectors (MV). Motion can be estimated independently for
either a 16x16 macroblock or any of its sub-macroblock partitions: 16x8, 8x16, 8x8, 8x4,
4x8, 4x4. An 8x8 macroblock partition is known as a sub-macroblock (or subblock).
Hereinafter, the term "block" generally refers to a rectangular group of adjacent pixels
of any dimensions, such as a whole 16x16 macroblock and/or a sub-macroblock
partition. Only one motion vector (MV) per sub-macroblock partition is allowed. The
motion can be estimated for each macroblock from different frames either in the past
or in the future, by associating the macroblock with the selected frame using the
macroblock's ref_idx.
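
The decoder side of this process can be sketched as follows. The sketch assumes integer-pel motion vectors, 8-bit samples, and pictures stored as lists of rows; the names `predict_inter_block` and `reconstruct` are hypothetical. Sub-pel interpolation, which H.264 also supports, is not modeled.

```python
def predict_inter_block(ref_pictures, ref_idx, mv, bx, by, w, h):
    """Fetch the w x h prediction for the block at (bx, by).

    ref_idx selects the reference picture, and mv = (dx, dy) is the
    displacement of the best match relative to the current block.
    Integer-pel motion only (an assumption of this sketch).
    """
    ref = ref_pictures[ref_idx]
    dx, dy = mv
    return [[ref[by + dy + y][bx + dx + x] for x in range(w)]
            for y in range(h)]


def reconstruct(pred, residual, max_val=255):
    """Add the decoded residual to the prediction and clip to [0, max_val]."""
    return [[min(max_val, max(0, p + r)) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, residual)]
```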

A P-slice may also contain intra coded macroblocks. The intra coded
macroblocks within a P-slice are compressed in the same way as the intra coded
macroblocks in an I-slice. Inter coded blocks are predicted using motion estimation
and compensation strategies.

If all the macroblocks of an entire frame are encoded and transmitted using
intra mode, it is referred to as transmission of an 'INTRA frame' (I-Frame or I-Picture).
An INTRA frame therefore consists entirely of intra macroblocks. Typically, an INTRA
frame must be transmitted at the start of an image transmission, when the receiver as
yet holds no received macroblocks. If a frame is encoded and transmitted by
encoding some or all of the macroblocks as inter macroblocks, then the frame is
referred to as an 'INTER frame'. Typically, an INTER frame comprises less data for
transmission than an INTRA frame. However, the encoder decides whether a
particular macroblock is transmitted as an intra coded macroblock or an inter coded
macroblock, depending on which is most efficient.

Every 16x16 macroblock to be inter coded in a P-slice may be partitioned into
16x8, 8x16, and 8x8 partitions. A sub-macroblock may itself be partitioned into 8x4,
4x8, or 4x4 sub-macroblock partitions. Each macroblock partition or sub-macroblock
partition in H.264 is assigned a unique motion vector. Inter coded macroblocks
and macroblock partitions have unique prediction modes and reference indices. It is
not allowed in the current H.264 standard for inter and intra predictions to be selected
and mixed together in different partitions of the same macroblock. In the H.264/AVC
design adopted in February 2002, the partitioning scheme initially adopted from
Wiegand et al. included support of switching between intra and inter on a
sub-macroblock (8x8 luma with 4x4 chroma) basis. This capability was later removed in
order to reduce decoding complexity.

In P-pictures and P-slices, the following additional block types are defined:

Table 2 - Inter Macroblock types for P slices

mb_type   Name of       NumMbPart  MbPartPredMode  MbPartPredMode  MbPartWidth  MbPartHeight
          mb_type       (mb_type)  (mb_type, 0)    (mb_type, 1)    (mb_type)    (mb_type)
0         P_L0_16x16    1          Pred_L0         NA              16           16
1         P_L0_L0_16x8  2          Pred_L0         Pred_L0         16           8
2         P_L0_L0_8x16  2          Pred_L0         Pred_L0         8            16
3         P_8x8         4          NA              NA              8            8
4         P_8x8ref0     4          NA              NA              8            8
inferred  P_Skip        1          Pred_L0         NA              16           16


FIG. 4 illustrates the combination of two (temporal) predictions for inter coding
a macroblock in a B-picture or B-slice.

As illustrated in FIG. 4, for a macroblock 410 to be inter coded within
B-pictures or B-slices, instead of using only one "Best Match" (BM) predictor
(prediction) for a current macroblock, two (temporal) predictions (BML0 and BML1)
are used for the current macroblock 410, which can be averaged together to form a
final prediction. In a B-picture or B-slice, up to two motion vectors (MVL0 and
MVL1), representing two estimates of the motion, per sub-macroblock partition are
allowed for temporal prediction. They can be from any reference pictures (List 0
Reference and List 1 Reference), subsequent or prior. The average of the pixel
values in the Best Matched blocks (BML0 and BML1) in the (List 0 and List 1)
reference pictures is used as the predictor. This standard also allows weighting the
pixel values of each Best Matched block (BML0 and BML1) unequally, instead of
averaging them. This is referred to as a Weighted Prediction mode and is useful in
the presence of special video effects, such as fading. A B-slice also has a special
mode - Direct mode. The spatial methods used in MotionCopy skip mode and the
Direct mode are restricted only to the estimation of the motion parameters and not of
the macroblocks (pixels) themselves, and no spatially adjacent samples are used.
In Direct mode the motion vectors for a macroblock are not explicitly sent.
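
The averaging and weighting of the two best-match blocks can be sketched as below. This is a simplified illustration with integer weights and rounding, not the exact weighted prediction formula of the H.264 standard; the function name `bi_predict` and the parameters `w0` and `w1` are hypothetical.

```python
def bi_predict(bm_l0, bm_l1, w0=1, w1=1):
    """Combine the List 0 and List 1 best-match blocks sample by sample.

    With the default weights this is a rounded average; unequal integer
    weights give a simplified form of weighted prediction.
    """
    total = w0 + w1
    return [[(w0 * a + w1 * b + total // 2) // total
             for a, b in zip(row0, row1)]
            for row0, row1 in zip(bm_l0, bm_l1)]
```

For a fade, for example, an encoder might favor the reference picture whose brightness is closer to that of the current picture by giving it the larger weight.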

The following macroblock types are defined for use in B-pictures and B-slices: 

Table 3 - Inter Macroblock types for B slices

mb_type   Name of         NumMbPart  MbPartPredMode  MbPartPredMode  MbPartWidth  MbPartHeight
          mb_type         (mb_type)  (mb_type, 0)    (mb_type, 1)    (mb_type)    (mb_type)
0         B_Direct_16x16  NA         Direct          NA              8            8
1         B_L0_16x16      1          Pred_L0         NA              16           16
2         B_L1_16x16      1          Pred_L1         NA              16           16
3         B_Bi_16x16      1          BiPred          NA              16           16
4         B_L0_L0_16x8    2          Pred_L0         Pred_L0         16           8
5         B_L0_L0_8x16    2          Pred_L0         Pred_L0         8            16
6         B_L1_L1_16x8    2          Pred_L1         Pred_L1         16           8
7         B_L1_L1_8x16    2          Pred_L1         Pred_L1         8            16
8         B_L0_L1_16x8    2          Pred_L0         Pred_L1         16           8
9         B_L0_L1_8x16    2          Pred_L0         Pred_L1         8            16
10        B_L1_L0_16x8    2          Pred_L1         Pred_L0         16           8
11        B_L1_L0_8x16    2          Pred_L1         Pred_L0         8            16
12        B_L0_Bi_16x8    2          Pred_L0         BiPred          16           8
13        B_L0_Bi_8x16    2          Pred_L0         BiPred          8            16
14        B_L1_Bi_16x8    2          Pred_L1         BiPred          16           8
15        B_L1_Bi_8x16    2          Pred_L1         BiPred          8            16
16        B_Bi_L0_16x8    2          BiPred          Pred_L0         16           8
17        B_Bi_L0_8x16    2          BiPred          Pred_L0         8            16
18        B_Bi_L1_16x8    2          BiPred          Pred_L1         16           8
19        B_Bi_L1_8x16    2          BiPred          Pred_L1         8            16
20        B_Bi_Bi_16x8    2          BiPred          BiPred          16           8
21        B_Bi_Bi_8x16    2          BiPred          BiPred          8            16
22        B_8x8           4          NA              NA              8            8
inferred  B_Skip          NA         Direct          NA              8            8


In B-slices, as shown in the above table, the two temporal predictions are 
always restricted to using the same block type. 

Deblocking filters and Overlapped Block Motion Compensation (OBMC) use
some spatial correlation. According to these methods, the reconstructed pixels,
after prediction and the addition of the associated residual, are spatially
processed/filtered depending upon their mode (intra or inter), position (MB/block
edges, internal pixels, etc.), motion information, associated residual, and the
surrounding pixel difference. This process can considerably reduce blocking
artifacts and improve quality, but on the other hand can also increase complexity
considerably (especially within the decoder). This process also may not always
yield the best results and it may itself introduce additional blurring on the edges.

SUMMARY OF THE INVENTION 

Existing video compression standards (e.g., MPEG-2 and H.264) do not allow
both intraframe (intra) and interframe (inter) predictions to be combined together (like
the combination of two interframe predictions in the inter-only bi-prediction) for
encoding a current macroblock or a subblock. In accordance with the principles of
the present invention, provision is made for the combination of intra predictions and
inter predictions in the encoding and decoding of a given macroblock, subblock, or
partition. The combination of intra and inter predictions enables improved gain
and/or encoding efficiency and/or may further reduce video data error propagation.

An embodiment of the invention provides for decoding a hybrid intra-inter
encoded block by combining a first prediction of a current block with a second
prediction of a current block; wherein the first prediction of the current block is intra
prediction and the second prediction of the current block is inter prediction.

Throughout the following description it will be assumed that the luminance
(luma) component of a macroblock comprises 16x16 pixels arranged as an array of 4
8x8 blocks, and that the associated chrominance components are spatially
sub-sampled by a factor of two in the horizontal and vertical directions to form 8x8 blocks.
Extension of the description to other block sizes and other sub-sampling schemes will
be apparent to those of ordinary skill in the art. The invention is not limited by the
16x16 macroblock structure but can be used in any segmentation based video coding
system.

BRIEF DESCRIPTION OF THE DRAWINGS

The above features of the present invention will become more apparent by 
describing in detail exemplary embodiments thereof with reference to the attached 
drawings in which: 

FIG. 1 shows the samples near a 4x4 pixel luma block to be intra coded, in
accordance with the H.264 standard;

FIG. 2 illustrates the nine directions of prediction encoding for the 4x4 block
of FIG. 1, in accordance with the H.264 standard;

FIG. 3 depicts a macroblock being inter coded by estimating a motion vector,
in accordance with the H.264 standard;

FIG. 4 illustrates the bi-prediction of a macroblock by combining two inter
codings, in accordance with the H.264 standard;

FIG. 5 depicts intra-inter hybrid bi-prediction of a 4x4 block combining inter
and intra prediction, in accordance with the principles of the invention;

FIG. 6 is a block diagram illustrating a video encoder and a video decoder, in
accordance with the principles of the invention;

FIG. 7 is a block diagram illustrating a video encoder, in accordance with the
principles of the invention;

FIG. 8 is a block diagram illustrating a video decoder, in accordance with the
principles of the invention; and

FIGs. 9A and 9B are block diagrams illustrating circuits for combining intra
and inter predictions in the encoder of FIG. 7 or the decoder of FIG. 8.

DETAILED DESCRIPTION OF THE INVENTION 

FIG. 5 depicts an example of hybrid intra-inter bi-prediction where the same
4x4 block is predicted using inter and intra prediction. FIG. 5 illustrates a new
bi-prediction mode type, herein called the intra-inter hybrid coding mode, distinguished
from the intra-only (FIGs. 1 and 2) and inter-only (FIGs. 3 and 4) prediction modes of
the related art, which, unlike the related art, can combine both spatial ("A" through "M",
301) and temporal (MV, 302) predictions to bi-predictively encode the current
macroblock or current subblock 110. This new bi-predictive (or multi-predictive) mode
provides that two (or more) predictions, which may include one or more intra
predictions, are to be used (combined) for making the final prediction of a given block
or macroblock. Bi-prediction may also be used in I-pictures, with combined
intra-intra predictions. These two intra predictions could use two different intra prediction
directions.

The disclosed hybrid bi-predictive coding mode allows both intraframe (intra)
and interframe (inter) predictions to be combined together (e.g., averaged, or
weighted) for encoding a current macroblock, sub-macroblock, or partition. In
accordance with the principles of the present invention, the related art's method of
combining predictions (bi-prediction or multi-prediction) is extended by providing for
the combination of intra predictions and inter predictions for the encoding of a given
macroblock, sub-macroblock, or partition. The combination of intra and inter
predictions allows for improved gain and/or encoding efficiency and/or may reduce
video data error propagation.
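
A minimal sketch of how a decoder might form such a hybrid prediction for one 4x4 block follows. It assumes the intra prediction uses the vertical mode (mode 0) from the four samples above the block, that the inter prediction has already been fetched by motion compensation, and that samples are 8-bit. The names `intra4x4_vertical`, `combine_predictions`, and `decode_hybrid_block`, and the integer weighting scheme, are hypothetical rather than taken from the disclosure.

```python
def intra4x4_vertical(above):
    """Mode 0 (vertical): each column repeats the sample directly above it."""
    return [list(above[:4]) for _ in range(4)]


def combine_predictions(predictions, weights=None):
    """Rounded weighted average of two or more equally sized predictions."""
    if weights is None:
        weights = [1] * len(predictions)   # plain averaging
    total = sum(weights)
    rows, cols = len(predictions[0]), len(predictions[0][0])
    return [[(sum(w * p[y][x] for w, p in zip(weights, predictions))
              + total // 2) // total
             for x in range(cols)]
            for y in range(rows)]


def decode_hybrid_block(intra_pred, inter_pred, residual, weights=None):
    """Combine one intra and one inter prediction, then add the residual."""
    pred = combine_predictions([intra_pred, inter_pred], weights)
    return [[min(255, max(0, p + r)) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, residual)]
```

Because `combine_predictions` accepts any number of predictions, the same sketch also covers the intra-intra and multi-predictive cases mentioned above.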

Embodiments of the invention provide several new macroblock modes/types
that can be integrated within a video encoder and decoder and could further improve
performance as compared with existing architectures. The new macroblock modes
are similar to the bi-predictive (or multi-predictive) macroblock modes already used in
several encoding architectures and standards such as MPEG-2 and H.264, in the
sense that they use two or more predictions for each macroblock or sub-block, but
they differ in the sense that they can also use (or only use) intraframe (spatial)
prediction, as contrasted with conventional inter-only (temporal) bi-prediction. It is
possible, for example, that the combination of two different intra predictions or the
combined usage of inter and intra predictions would give a better prediction for a
given macroblock, while it could also be beneficial in reducing blocking artifacts given
that adjacent spatial samples may be considered during the performance of the
disclosed bi-prediction coding method. The disclosed method of combining intra
and inter predictions for coding the same macroblock or subblock can lead to higher
performance because a) either prediction may contain important distinct information
that is not preserved if only a single prediction is used, b) either picture may contain
different encoding artifacts that can be reduced through averaging or weighting, c)
averaging functions as a noise reduction mechanism, etc.
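
Point c) can be made concrete with a short variance calculation. This is a sketch under assumptions not stated in the source: the errors of the two predictions, e_1 and e_2, are modeled as zero-mean random variables with variances sigma_1^2 and sigma_2^2 and correlation coefficient rho (all hypothetical symbols).

```latex
\operatorname{Var}\!\left(\frac{e_1 + e_2}{2}\right)
  = \frac{\sigma_1^2 + \sigma_2^2 + 2\rho\,\sigma_1\sigma_2}{4},
\qquad
\sigma_1 = \sigma_2 = \sigma,\ \rho = 0
\;\Rightarrow\;
\operatorname{Var}\!\left(\frac{e_1 + e_2}{2}\right) = \frac{\sigma^2}{2}.
```

Under this model, averaging halves the error variance when the two errors are uncorrelated and of equal power, and the benefit shrinks as rho approaches 1. An intra and an inter prediction draw on different samples (spatial versus temporal), which is one plausible reason their errors might be only weakly correlated.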

Additionally, the disclosed bi-predictive (or multi-predictive) macroblock
coding mode supports inter prediction modes that are not constrained to use the
same partition types, and allows the use of all possible combinations of intra and
single-list inter types that are defined in Tables 1 through 3. The disclosed
bi-predictive (or multi-predictive) macroblock coding mode supports inter and intra
predictions to be performed based upon different partitions of the same macroblock
to be coded. For example, if only up to two (bi) predictions per macroblock are
allowed for a hybrid intra-inter coded macroblock, the first prediction could be
Intra4x4 (mb_type 0 in Table 1) while the second prediction could be the 16x8 list 1
block prediction (mb_type 6 in Table 3).

Syntax and submode types 

The prediction type(s) employed in hybrid-encoding each macroblock are
signaled within the bitstream, either in a combined form (like the form used for B
slices in H.264) or separately (i.e., using a tree structure). Optionally, the number of
predictions (e.g., 1, 2 or more) could also be signaled within the bitstream. The
combined signaling method employed in the related art would necessitate the
enumeration of all possible or the most likely combinations of prediction type(s) and
may not result in the highest compression gain. The compression gain can be
optimized by employing a separate tree-structured architecture, signaling separately
each prediction mode. This method allows the use of all possible combinations of
intra and single-list inter types that are defined in Tables 1 through 3, while keeping
syntax simple. For example, if only up to two (bi) predictions per macroblock are
allowed for a hybrid-coded macroblock, the first prediction could be Intra4x4 (mb_type
0 in Table 1) while the second prediction could be the 16x8 list 1 block prediction
(mb_type 6 in Table 3). For these additional submodes their associated parameters
also need to be transmitted, such as the intra direction and/or the associated
reference indices and motion vectors. This approach allows various combinations,
such as both/all predictions being intra but having different directions, or being
different-list predictions, or using different block partitions.
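
The separate, tree-structured signaling can be pictured as one small record per prediction, each carrying its own type and parameters. The sketch below is purely illustrative: the class `Prediction`, the function `signal_macroblock`, and every syntax element name it emits are hypothetical and do not correspond to actual H.264 syntax elements or to a syntax defined in this disclosure. Entropy coding is not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Prediction:
    table: int      # which type table the mb_type refers to (1 = intra, 3 = B inter)
    mb_type: int    # row of that table
    intra_directions: List[int] = field(default_factory=list)
    ref_indices: List[int] = field(default_factory=list)
    motion_vectors: List[Tuple[int, int]] = field(default_factory=list)


def signal_macroblock(predictions):
    """List the (name, value) elements to write for one hybrid macroblock.

    The number of predictions is signaled first, then each prediction is
    signaled separately together with its own parameters.
    """
    elements = [("num_predictions", len(predictions))]
    for i, p in enumerate(predictions):
        elements.append((f"pred{i}_table", p.table))
        elements.append((f"pred{i}_mb_type", p.mb_type))
        for direction in p.intra_directions:
            elements.append((f"pred{i}_intra_dir", direction))
        for ref_idx, mv in zip(p.ref_indices, p.motion_vectors):
            elements.append((f"pred{i}_ref_idx", ref_idx))
            elements.append((f"pred{i}_mv", mv))
    return elements
```

For the example in the text, the first record would be an Intra4x4 prediction (Table 1, mb_type 0) with sixteen 4x4 directions, and the second a 16x8 list 1 prediction (Table 3, mb_type 6) with one reference index and one motion vector per partition.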

It may also be preferable to make adjustments and extensions to the
submodes provided in the H.264 standard, since (a) some combinations are identical,
and (b) it may be desirable in some cases to use a single prediction for a macroblock.
For example, for case (a), we may disallow identical prediction modes and
automatically adjust the submode types, while for case (b) we can introduce the
following additional modes, which define a new Null block prediction type that implies

Table 4 - Null SubMacroblock types

Name of         NumMbPart  MbPartPredMode  MbPartPredMode  MbPartWidth  MbPartHeight
mb_type         (mb_type)  (mb_type, 0)    (mb_type, 1)    (mb_type)    (mb_type)
B_Null_L0_16x8  2          Null            Pred_L0         16           8
B_Null_L0_8x16  2          Null            Pred_L0         8            16
B_Null_L1_16x8  2          Null            Pred_L1         16           8
B_Null_L1_8x16  2          Null            Pred_L1         8            16
B_L0_Null_16x8  2          Pred_L0         Null            16           8
B_L0_Null_8x16  2

A combination of two null prediction block types for the same sub-partition (e.g.,
B_Null_L0_8x16 and B_Null_L1_8x16) is forbidden, which again implies that
adapting the related art's mb_type table would provide further advantages. A similar
extension could be made for 8x8 subblocks/partitions. Because all bi-predictive
modes defined in Table 3 can be supported by the disclosed hybrid mode, they could
be eliminated as redundant.
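
The restriction on null combinations can be sketched as a simple per-partition check. The representation below, a tuple of prediction modes with one entry per partition, and the function name `is_valid_combination` are assumptions made for illustration.

```python
def is_valid_combination(first, second):
    """Reject combinations where both predictions are Null for a partition.

    first, second: per-partition prediction modes of the two submodes,
    e.g. ("Null", "Pred_L0") for B_Null_L0_8x16.
    """
    return all(not (a == "Null" and b == "Null")
               for a, b in zip(first, second))
```

With this representation, B_Null_L0_8x16 combined with B_Null_L1_8x16 is rejected because partition 0 would receive no prediction at all, whereas B_Null_L0_8x16 combined with a type that is Null only in partition 1 is accepted.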

Extending Direct mode with hybrid intra-inter bi-prediction
The spatial Direct mode used in H.264 can be extended with a hybrid
intra-inter bi-prediction mode embodiment of the invention. Currently, the motion vectors
for a Direct mode block are determined based on the median of the motion vectors of
the three neighboring blocks.

If the bi-prediction mode of at least one neighbor is hybrid (inter-intra), then
the prediction mode of the current (Direct mode) block could also be hybrid
(inter-intra). This method of prediction could be restricted according to the availability of
prediction lists utilized by the neighboring blocks. For example, if both lists are
available in the spatial neighbors, then the motion vectors of a Direct mode block are
calculated again using median prediction regardless of whether one of the neighbors
utilizes hybrid prediction. On the other hand, if only one list is available (e.g., list_1),
while one of the neighbors utilizes hybrid prediction, then the Direct mode block will be
predicted with hybrid prediction as well, while also utilizing the same available list.
Further, if more than one adjacent block utilizes hybrid prediction, then some simple
rules can be defined regarding the intra mode that is to be used for the prediction.
For example, Intra16x16 should for this case supersede Intra4x4 due to its simplicity
for the prediction. Intra4x4 could also be disallowed within Direct mode blocks, but if
used, then every 4x4 block direction may be predicted initially using the external
macroblock predictions if available. If no external Intra4x4 macroblock is available,
then the lowest intra mb_type available from the adjacent blocks is used. In general,
if the prediction mode of more than one of the spatially neighboring blocks is intra,
then the lowest order intra prediction is preferably used, while intra is preferably not
used if both lists are available. Intra is also not used if the samples required for the
prediction are not available.
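
The list-availability rule above can be sketched as a small decision function alongside the median motion vector computation. The sketch reflects one reading of the text: a list counts as "available" if any of the three spatial neighbors uses it. The neighbor representation (a dict with `lists` and `hybrid` keys) and both function names are hypothetical, and the rules for choosing the intra mode are not modeled.

```python
def median_mv(mv_a, mv_b, mv_c):
    """Component-wise median of the three neighboring motion vectors."""
    xs = sorted(mv[0] for mv in (mv_a, mv_b, mv_c))
    ys = sorted(mv[1] for mv in (mv_a, mv_b, mv_c))
    return (xs[1], ys[1])


def direct_mode_uses_hybrid(neighbors):
    """Decide whether a Direct mode block inherits hybrid prediction.

    neighbors: list of dicts with keys
      "lists"  - set of prediction lists the neighbor uses ("L0", "L1")
      "hybrid" - True if the neighbor is hybrid (inter-intra) coded
    """
    available = set()
    for neighbor in neighbors:
        available |= neighbor["lists"]
    if available >= {"L0", "L1"}:
        # Both lists available: ordinary median prediction, no intra part.
        return False
    any_hybrid = any(neighbor["hybrid"] for neighbor in neighbors)
    return any_hybrid and len(available) == 1
```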

In-Loop Deblocking Filter

Video encoders in accordance with the H.264/AVC standard may employ an
in-loop deblocking filter to increase the correlation between adjacent pixels mainly at
block and macroblock (MB) edges and to reduce the blockiness (blocking artifacts)
introduced in a decoded picture. The filtered decoded pictures are used to predict the
motion for other pictures. The deblocking filter is an adaptive filter that adjusts its
strength depending upon compression mode of a macroblock (intra or inter), the
quantization parameter, motion vector, frame or field coding decision and the pixel
values. For smaller quantization sizes the filter shuts itself off. This filter can also be
shut off explicitly by an encoder at the slice level. Different strengths and methods
for the deblocking filter are employed depending on the adjacent block coding types,
motion and transmitted residual. By taking advantage of the features of an
embodiment of the current invention, since the hybrid intra-inter macroblock type
already contains an intra predictor that considers adjacent pixels during its encoding,
the strength of the deblocking filter may be modified accordingly. For example, if a
block is hybrid coded and its prediction mode uses two (or more) predictions, of
which one is intra, and has additional coefficients, then the filter strength for the
corresponding edges can be reduced (e.g., by one).
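
The example rule at the end of the paragraph above can be sketched as a small adjustment applied after the usual strength decision. The function name, its parameters, and the clamp at zero are assumptions for illustration; how the base strength itself is derived is outside the scope of this sketch.

```python
def adjust_filter_strength(base_strength, num_predictions,
                           has_intra_prediction, has_coefficients,
                           reduction=1):
    """Lower the deblocking strength for edges of hybrid coded blocks.

    The strength is reduced only when the block uses two or more
    predictions, at least one of them intra, and carries coefficients.
    """
    if num_predictions >= 2 and has_intra_prediction and has_coefficients:
        return max(0, base_strength - reduction)   # never below "filter off"
    return base_strength
```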

FIG. 6 shows a block schematic diagram illustrating a video encoder 604 and
a video decoder 605 in accordance with an embodiment of the present invention.
Encoder 604 receives data about an image sequence from a video source 602, e.g.,
a camera. After compressing and encoding the data from the camera in accordance
with the methods herein disclosed, the encoder 604 passes the information to a
transmission system 606. Transmitter 606 transmits the bitstream containing the
hybrid-encoded macroblocks over a channel medium 611. The transmission channel
medium 611 may be via wireless, cable, or any other transmission or routing scheme.
The circuitry of either encoder 604 or decoder 605 may, for example, form part of a
mobile or a portable two-way radio, or a mobile phone.

The moving image decoding apparatus (decoder 605) on the decoding side
of the channel medium 611 receives the bitstream in a bitstream buffer 631,
parses the bitstream (with bitstream parser-processor 633), decodes the encoded
image data from an intra coded frame A, and stores the decoded image data of the frame A
in a frame memory 635. Upon reception of inter encoded difference data, the
decoding apparatus (decoder 605) generates a motion compensation predicted
image from the decoded data of the frame A. If there is no error in the received data
of the frame A, since the decoded image of the frame A matches the local decoded
image of the frame A on the encoding apparatus (encoder 604) side, the motion
compensation predicted image generated from this decoded data matches the motion
compensation predicted image on the encoding apparatus (encoder 604) side. Since
the encoding apparatus (encoder 604) sends out the difference image between the
original image and the motion compensation predicted image, the decoding
apparatus can generate a decoded image of the frame B by adding the motion
compensation predicted image to the received difference image. If the data of the
frame A received by the decoding apparatus contains an error, a correct decoded
image of the frame A cannot be generated. As a consequence, all images generated
from the error-containing portion of the motion compensation predicted image
become erroneous data. These erroneous data remain until refresh processing is
performed by intra-encoding.
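The reconstruction of the frame B and the way an error in the frame A carries over can be illustrated with a toy one-dimensional example. The helper names, the sample values, and the whole-sample shift used as "motion compensation" are assumptions chosen for brevity and are not taken from the disclosure.

```python
def motion_compensate(reference, motion):
    """Toy 1-D motion compensation: shift by whole samples, repeating edge samples."""
    last = len(reference) - 1
    return [reference[min(max(i + motion, 0), last)] for i in range(len(reference))]


def decode_inter_frame(reference, motion, difference):
    """Decoded frame = motion compensation predicted image + received difference image."""
    prediction = motion_compensate(reference, motion)
    return [p + d for p, d in zip(prediction, difference)]


# Encoder side: the difference image is the original frame B minus the prediction.
frame_a = [10, 20, 30, 40]
frame_b = [12, 22, 29, 41]
motion = 1
difference = [b - p for b, p in zip(frame_b, motion_compensate(frame_a, motion))]

# Decoder side, error-free frame A: frame B is recovered exactly.
clean_b = decode_inter_frame(frame_a, motion, difference)

# Decoder side, frame A corrupted at one sample: the error reappears in frame B,
# displaced by the motion, and stays there until an intra refresh.
corrupted_a = [10, 20, 35, 40]
damaged_b = decode_inter_frame(corrupted_a, motion, difference)
errors = [d - b for d, b in zip(damaged_b, frame_b)]
print(clean_b, damaged_b, errors)
```

In this sketch the corrupted sample of the frame A is the only source of error, yet it shows up undiminished in the decoded frame B because the difference image was computed against the error-free prediction.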

When the decoder 605 receives hybrid-encoded bi-predictive data, hybrid-coded
block data containing intra coded information can be employed at the decoder
to prevent or arrest the propagation of image data errors. Some portions or areas of,
or objects within, an image or sequence of images may be identifiable by the user or
by the encoding apparatus as being more important or more susceptible to errors
than other areas. Thus, the encoding apparatus (encoder 604) may be adapted to
selectively hybrid-encode the macroblocks in the more-important areas or objects in a
frame sequence, so that image data error propagation in those areas is prevented or
reduced.
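One way to see why a hybrid prediction can limit error propagation is to compare the prediction obtained from a clean reference with the one obtained from a corrupted reference. The sketch below assumes equal weights for the two predictions and assumes the intra prediction is formed from error-free neighboring pixels of the current frame; both are illustrative assumptions, as are the names and sample values.

```python
def hybrid_prediction(intra, inter):
    """Equal-weight average of an intra and an inter prediction, per sample."""
    return [(a + b) / 2 for a, b in zip(intra, inter)]


# Assumption: the intra prediction comes from correctly decoded neighbors.
intra = [100, 100, 100, 100]
inter_clean = [104, 96, 100, 108]
# The reference frame carries an error of +16 at the third sample.
inter_corrupted = [104, 96, 116, 108]

clean = hybrid_prediction(intra, inter_clean)
corrupted = hybrid_prediction(intra, inter_corrupted)
errors = [c - k for c, k in zip(corrupted, clean)]
print(errors)
```

With equal weights the reference error enters the combined prediction at half its size, whereas a purely inter coded block would inherit it in full. A different weighting would change the attenuation accordingly.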

FIG. 7 depicts an exemplary encoder, according to an embodiment of the
invention, indicated generally by the reference numeral 700. The encoder 700
includes a video input terminal 712 that is coupled in signal communication to a
positive input of a summing block 714. The summing block 714 is coupled, in turn,
to a function block 716 for implementing an integer transform to provide coefficients.
The function block 716 is coupled to an entropy coding block 718 for implementing
entropy coding to provide an output bitstream. The function block 716 is further
coupled to an in-loop portion 720 at a scaling and inverse transform block 722. The
function block 722 is coupled to a summing block 724, which, in turn, is coupled to an
intra-frame prediction block 726. The intra-frame prediction block 726 is coupled to a
first input of a combining unit 727, the output of which is coupled to a second input of the
summing block 724 and to an inverting input of the summing block 714.

The output of the summing block 724 is coupled to a deblocking filter 740.
The deblocking filter 740 is coupled to a frame store 728. The frame store 728 is
coupled to a motion compensation (inter-frame prediction) block 730, which is
coupled to a second input of the combining unit 727.

The combining unit 727 combines a first (intra) prediction, from the intra-frame
prediction block 726, with a second (inter) prediction from the motion
compensation (inter-frame prediction) block 730, to output a resulting combined
(hybrid intra-inter) prediction to the second input of the summing block 724 and to an
inverting input of the summing block 714. In some embodiments of the invention,
the combining unit 727 may be implemented as a summing block (e.g., similar to
summing block 724 or 714) operatively coupled to one or more gain blocks (see FIG.
9A), to produce either an "average" of the input intra and inter predictions, or a
differently weighted combination of the intra and inter predictions. In other
embodiments of the invention, the combining unit 727 may be implemented as a
sequential adder circuit adapted to combine a first intra prediction from the intra-frame
prediction block 726 with a second (e.g., subsequent) intra prediction from the
intra-frame prediction block 726; and further adapted to combine a first (intra)
prediction from the intra-frame prediction block 726 with a second (inter) prediction
from the motion compensation (inter-frame prediction) block 730.

The video input terminal 712 is further coupled to a motion estimation block
719 to provide motion vectors. The deblocking filter 740 is coupled to a second
input of the motion estimation (inter-frame prediction) block 719. The output of the
motion estimation block 719 is coupled to the motion compensation (inter-frame
prediction) block 730 as well as to a second input of the entropy coding block 718.

The video input terminal 712 is further coupled to a coder control block 760.
The coder control block 760 is coupled to control inputs of each of the blocks 716,
718, 719, 722, 726, 730, and 740 for providing control signals to control the operation
of the encoder 700.
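The signal path of FIG. 7 around the combining unit 727 and the summing blocks 714 and 724 can be sketched as follows. This is a simplified illustration: the integer transform, scaling, quantization, and entropy coding are omitted so that the residual passes through unchanged, the rounded integer average is one assumed choice of combination, and all function names and sample values are hypothetical.

```python
def combine(intra, inter):
    """Combining unit 727 (assumed form): rounded integer average of two predictions."""
    return [(a + b + 1) // 2 for a, b in zip(intra, inter)]


def encode_block(source, intra, inter):
    """Summing block 714: source minus the combined prediction (inverting input)."""
    prediction = combine(intra, inter)
    residual = [s - p for s, p in zip(source, prediction)]
    return prediction, residual


def reconstruct_block(residual, prediction):
    """Summing block 724: decoded residual plus the same combined prediction."""
    return [r + p for r, p in zip(residual, prediction)]


source = [50, 60, 70, 80]
intra = [48, 58, 72, 79]
inter = [52, 61, 70, 83]

prediction, residual = encode_block(source, intra, inter)
reconstructed = reconstruct_block(residual, prediction)
print(prediction, residual, reconstructed)
```

Because the same combined prediction feeds both summing blocks, the in-loop reconstruction matches the source whenever the residual survives the omitted transform stages unchanged. With real quantization the reconstruction would differ from the source by the quantization error.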

FIG. 8 depicts an exemplary decoder, according to an embodiment of the
invention, indicated generally by the reference numeral 800. The decoder 800
includes an entropy decoding block 810 for receiving an input bitstream. The
decoding block 810 is coupled for providing coefficients to an in-loop portion 820 at a
scaling and inverse transform block 822. The inverse transform block 822 is
coupled to a summing block 824, which, in turn, is coupled to an intra-frame
prediction block 826. The intra-frame prediction block 826 is coupled to a first input
of a combining unit 827, the output of which is coupled to a second input of the
summing block 824.

The output of the summing block 824 is coupled to a deblocking filter 840 for
providing output images. The deblocking filter 840 is coupled to a frame store 828.
The frame store 828 is coupled to a motion compensation (inter-frame prediction)
block 830, which is coupled to a second input of the combining unit 827. The
decoding block 810 is further coupled for providing motion vectors to a second input
of the motion compensation (inter-frame prediction) block 830.

The decoder combining unit 827 is similar in function to the combining unit
727 in the encoder of FIG. 7 in that it combines a first (intra) prediction, from the
intra-frame prediction block 826, with a second (inter) prediction from the motion
compensation (inter-frame prediction) block 830, to output a resulting combined
(hybrid intra-inter) prediction to the second input of the summing block 824. In some
embodiments of the invention, the combining unit 827 may be implemented as a
summing block (e.g., similar to summing block 824) operatively coupled to one or
more gain blocks (see FIG. 9A), to produce either an "average" of the input intra and
inter predictions, or a differently weighted combination of the intra and inter
predictions. In other embodiments of the invention, the combining unit 827 may be
implemented as a sequential adder circuit adapted to combine a first intra prediction
from the intra-frame prediction block 826 with a second (e.g., subsequent) intra
prediction from the intra-frame prediction block 826; and further adapted to combine
a first (intra) prediction from the intra-frame prediction block 826 with a second (inter)
prediction from the motion compensation (inter-frame prediction) block 830.

The entropy decoding block 810 is further coupled for providing input to a
decoder control block 862. The decoder control block 862 is coupled to control
inputs of each of the blocks 822, 826, 830, and 840 for communicating control signals
and controlling the operation of the decoder 800.

FIGs. 9A and 9B are block schematic diagrams each illustrating an
exemplary embodiment of the combining unit (e.g., 727 or 827) in the encoder of FIG.
7 or in the decoder of FIG. 8, comprising circuits for additively combining a first and
second prediction, e.g., additively combining an intra and an inter prediction. FIG.
9A depicts an exemplary combining unit x27-a (e.g., for implementing combining unit
727 and 827) including an adder circuit (denoted by the Sigma signal) A27 being
adapted to combine a first intra prediction from the coupled intra-frame prediction
block (e.g., 726 or 826) with a second (e.g., subsequent) intra prediction from the
intra-frame prediction block (e.g., 726 or 826); and being further adapted to combine
a first (intra) prediction from the intra-frame prediction block (e.g., 726 or 826) with a
second (inter) prediction from the motion compensation (inter-frame prediction) block
(e.g., 730 or 830). Digital gain blocks G1, G2, and G3 provide for weighting (or
simply averaging) of the plurality (e.g., two) of predictions to be combined. Persons
skilled in the art will recognize that in alternative embodiments of the invention, fewer
than three (e.g., one or two) digital gain blocks may be provided to combine two
predictions, for example, as depicted in FIG. 9B.

FIG. 9B depicts an exemplary combining unit x27-b (e.g., 727 and 827)
including an adder circuit (denoted by the Sigma signal) A27 being adapted to average
a first intra prediction from the intra-frame prediction block (e.g., 726 or 826) with a
second (e.g., subsequent) intra prediction from the intra-frame prediction block (e.g.,
726 or 826); and being further adapted to average a first (intra) prediction from the
intra-frame prediction block (e.g., 726 or 826) with a second (inter) prediction from the
motion compensation (inter-frame prediction) block (e.g., 730 or 830). Digital gain
block G3, being fixed at a gain value of one half (1/2), provides for dividing (averaging)
the sum of two predictions output from the adder circuit (denoted by the Sigma
signal) A27.
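The gain-block arrangements of FIGs. 9A and 9B can be modeled arithmetically as shown below. The placement of G1 and G2 on the two adder inputs is an assumption made for this sketch (the text states only that G3 acts on the adder output in FIG. 9B), and the function name and gain values are hypothetical. Exact fractions are used so that the weights are not blurred by floating-point rounding.

```python
from fractions import Fraction


def combining_unit(first, second,
                   g1=Fraction(1), g2=Fraction(1), g3=Fraction(1, 2)):
    """Model of adder A27 with gain blocks: output = G3 * (G1 * first + G2 * second).

    Assumed topology: G1 and G2 scale the two inputs, G3 scales the adder output.
    The defaults (G1 = G2 = 1, G3 = 1/2) correspond to the averaging case of FIG. 9B.
    """
    return [g3 * (g1 * a + g2 * b) for a, b in zip(first, second)]


intra = [100, 120]
inter = [110, 100]

# FIG. 9B case: plain average of the two predictions.
averaged = combining_unit(intra, inter)

# FIG. 9A case with illustrative weights: 3/4 intra, 1/4 inter, unity output gain.
weighted = combining_unit(intra, inter,
                          g1=Fraction(3, 4), g2=Fraction(1, 4), g3=Fraction(1))
print(averaged, weighted)
```

The same function covers the combination of two intra predictions, since the adder treats both inputs alike. A practical implementation would additionally round the result to an integer pixel value.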

Various aspects of the present invention can be implemented in software,
which may be run in a general purpose computer or any other suitable computing
environment. The present invention is operable in a number of general purpose or
special purpose computing environments such as personal computers, general-purpose
computers, server computers, hand-held devices, laptop devices,
multiprocessors, microprocessors, set top boxes, programmable consumer
electronics, network PCs, minicomputers, mainframe computers, distributed
computing environments and the like to execute computer-executable instructions for
performing a frame-to-frame digital video encoding of the present invention, which are
stored on a computer readable medium. The present invention may be implemented
in part or in whole as computer-executable instructions, such as program modules
that are executed by a computer. Generally, program modules include routines,
programs, objects, components, data structures and the like to perform particular
tasks or to implement particular abstract data types. In a distributed computing
environment, program modules may be located in local or remote storage devices.

Exemplary embodiments of the invention have been explained above and are
shown in the figures. However, the present invention is not limited to the exemplary
embodiments described above, and it is apparent that variations and modifications
can be effected by those skilled in the art within the spirit and scope of the present
invention. Therefore, the exemplary embodiments should be understood not as
limitations but as examples. The scope of the present invention is not determined
by the above description but by the accompanying claims, and variations and
modifications may be made to the embodiments of the invention without departing
from the scope of the invention as defined by the appended claims and equivalents.



